This document contains code to reproduce the visualizations of the simulation study in the manuscript. It also contains additional results not shown in the main manuscript.
```r
sim_results <- readRDS(here::here("output", "sim_results.RDS"))

# Create a cut version of the simulation results without MCMC diagnostics
sim_res_cut <- sim_results |>
  select(-contains("rhat")) |>
  select(-contains("divtrans"))
```
3 Data-Generating Processes
These are the data-generating matrices that we used for our fixed effects:
We had multiple minor deviations from our preregistration, most of which occurred due to errors on our side. We do not think that any of these deviations decreases the informativeness of our study.
4.0.1 Deviations from Preregistration
| Deviation | Explanation |
|---|---|
| Updated BmlVAR formula to fix subscripts and notation | The preregistered model formula omitted some necessary subscripts and used inconsistent distributional notation. These were corrected for clarity and correctness in the final version. |
| Renamed centrality predictor from d to c | The label c was chosen for clarity. |
| Incorrect prior notation for the second-step regression residual variance | We accidentally noted a t-prior on \(\log(\sigma)\), but a half-t prior on \(\sigma\) was used in both steps for consistency across models and had already been implemented in the code. We fixed the notation to align with our actual implementation. |
| Omitted sum over individuals in the centrality selection accuracy formula | The preregistered formula did not include a summation over individuals, which is necessary to compute the proportion. |
| 200 instead of 100 simulation repetitions | As we explained in the manuscript, we accidentally calculated that we needed 200 instead of the actual 100 simulation repetitions to achieve the desired worst-case MCSE. We kept the 200 repetitions, as they are informative and increase precision. |
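For illustration only (the symbols below are ours, not the preregistered notation), a selection-accuracy proportion that includes the summation over individuals \(j = 1, \dots, N\) takes the general form: \[\begin{align*}
\widehat{\text{acc}} = \frac{1}{N} \sum_{j=1}^{N} \mathbb{1}\!\left(\hat{c}_j = c_j\right)
\end{align*}\] where \(\mathbb{1}(\cdot)\) is the indicator function; without the sum over \(j\), the expression does not define a proportion.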
5 Data Wrangling
Prepare dataframe for visualization by giving proper names, removing unnecessary columns, and pivoting longer:
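The wrangling chunk itself is not reproduced in this extract. As a minimal sketch of the kind of reshaping described (renaming, dropping columns, pivoting longer), the following uses a hypothetical toy data frame `toy_results` with made-up column names rather than the actual `sim_res_cut`:

```r
library(dplyr)
library(tidyr)

# toy stand-in for the summarized simulation results (hypothetical columns)
toy_results <- tibble(
  dgp       = c("sparse", "sparse"),
  n_id      = c(75, 200),
  bias_temp = c(0.02, 0.01),
  bias_cont = c(0.03, 0.02)
)

# give proper names and pivot longer so each row holds one
# condition-outcome combination, ready for faceted plotting
toy_long <- toy_results |>
  rename(design = dgp) |>
  pivot_longer(
    cols = starts_with("bias_"),
    names_to = "outcome",
    names_prefix = "bias_",
    values_to = "mean"
  )
```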
Check if any errors that led to non-convergence occurred:
```r
# load results before resummarization, which contain potential error messages
sim_full_pre_resum <- readRDS(here("output", "sim_full.rds"))
sim_errors <- SimExtract(sim_full_pre_resum, "errors")
```
There were 0 error messages.
6.2 Warnings
We extract all warnings and rename them properly:
```r
sim_warnings <- SimExtract(sim_full_pre_resum, what = "warnings")

sim_warnings |>
  select(!c(heterogeneity, strength_sd, outstrength_sd, instrength_sd)) |>
  rename(
    "Deprecated dplyr function" = "WARNING: c(\"Warning in NULL : `funs()` was deprecated in dplyr 0.8.0.\", \"Warning in NULL : Please use a list of either functions or lambdas: \\n\\n # Simple nam",
    "Bulk ESS low" = "WARNING: Warning in NULL : Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.\nRunning the chains for more i",
    "Potential Stan sampling problems" = "WARNING: Warning in NULL : Examine the pairs() plot to diagnose sampling problems\n",
    "Tail ESS low" = "WARNING: Warning in NULL : Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.\nRunning the chains",
    "NA R-hat" = "WARNING: Warning in NULL : The largest R-hat is NA, indicating chains have not mixed.\nRunning the chains for more iterations may help. See\nhttps://mc-stan.org/mi",
    "Divergent transition" = "WARNING: Warning in NULL : There were 1 divergent transitions after warmup. See\nhttps://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup\nto find",
    "Exceeded maximum treedepth" = "WARNING: Warning in NULL : There were 497 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 10. See\nhttps://mc-stan.org",
    "Rothman algorithm warning" = "WARNING: Warning in Rothmana(data_l, data_c, lambdas$beta[i], lambdas$kappa[i], regularize_mat_beta = regularize_mat_beta, regularize_mat_kappa = regulariz",
    "Variable naming warning" = "WARNING: Warning in validityMethod(object) : The following variables have undefined values: reg_intercept_z[1],The following variables have undefined values: "
  ) |>
  knitr::kable()
```
| dgp | n_id | n_tp | Deprecated dplyr function | Bulk ESS low | Potential Stan sampling problems | Tail ESS low | NA R-hat | Divergent transition | Exceeded maximum treedepth | Rothman algorithm warning | Variable naming warning |
|---|---|---|---|---|---|---|---|---|---|---|---|
| sparse | 75 | 60 | 60 | 200 | 100 | 200 | 28 | 100 | 3 | 4759 | 35 |
| sparse | 200 | 60 | 0 | 200 | 1 | 199 | 15 | 1 | 0 | 12834 | 23 |
| sparse | 75 | 120 | 0 | 199 | 26 | 197 | 5 | 26 | 0 | 4555 | 2 |
| sparse | 200 | 120 | 0 | 200 | 0 | 163 | 3 | 0 | 0 | 12042 | 32 |
Unfortunately, we removed warnings from mlVAR during our simulation. However, in our additional simulations in 06_additional_mlvar_simulation.qmd, we noticed that the warning messages from mlVAR related only to the renaming of certain variables, so we assume that we did not miss any substantive warnings.
7 Point Estimates
Plot point estimate recovery (RMSE), which is helpful for understanding the overall performance of the different methods:
```r
plot_mostcentral <- sr_edit |>
  # if mcse is missing, set to 0
  mutate(mcse = ifelse(is.na(mcse), 0, mcse)) |>
  mutate(outcome = factor(outcome, levels = c("Temporal", "Contemporaneous"))) |>
  filter(pm == "mostcent") |>
  mutate(n_tp = paste0("t = ", n_tp)) |>
  mutate(n_tp = factor(n_tp, levels = c("t = 60", "t = 120"))) |>
  mutate(n_id = paste0("n = ", n_id)) |>
  mutate(n_id = factor(n_id, levels = c("n = 75", "n = 200"))) |>
  ggplot(aes(x = as.factor(n_tp), y = mean,
             group = method, colour = method)) +
  # horizontal line at chance level of 1/6
  geom_hline(yintercept = 1/6, linetype = 2, alpha = .7) +
  # add vertical line between methods
  geom_vline(colour = "#F3F4F5", xintercept = seq(1.5, 4, 1)) +
  geom_errorbar(aes(ymin = mean - 1*mcse, ymax = mean + 1*mcse),
                width = .5,
                position = position_dodge(0.8),
                show.legend = FALSE) +
  geom_point(position = position_dodge(0.8), size = 1.4) +
  # add mean as text above errorbar
  geom_text(aes(label = sub("^(-?)0.", "\\1.", sprintf("%.2f", mean))),
            position = position_dodge(0.8),
            vjust = -1.0, size = 4,
            show.legend = FALSE) +
  ggh4x::facet_nested(n_id ~ outcome,
                      axes = "all",
                      remove_labels = "y") +
  # scale_x_discrete(guide = guide_axis(angle = 90)) +
  scale_y_continuous(expand = c(0, 0), limits = c(0, 1.1)) +
  scale_color_manual(values = meth_colors) +
  theme_centrality() +
  theme(legend.position = "bottom",
        text = element_text(size = 24)) +
  labs(title = "",
       x = "Time Points",
       colour = "",
       y = "Proportion of Correct Central Nodes")

ggsave("plot_mostcentral.pdf", plot_mostcentral,
       height = 6*1.1, width = 9*1.1,
       path = here::here("figures/"), device = "pdf")

plot_mostcentral
```
8.2 Plot rank correlation of centrality measures
We were interested in the performance of centrality estimation, which we assessed in terms of rank-order recovery. Because we expected centrality estimates to be biased downwards for some methods due to sparsity assumptions, we computed the rank-order consistency of individual centrality estimates. To do so, we computed the point estimate of network centrality \(\hat{c}\) per individual in a data set, ignoring estimation uncertainty. We then estimated the Spearman rank correlation \(\hat{\rho}_{i}\) between these estimates and the true network centrality \(c\) in each repetition \(i\), and calculated its average across repetitions as: \[\begin{align*}
\widehat{\rho} = \frac{\sum_{i=1}^{n_{\text{sim}}} \hat{\rho}_i}{n_{\text{sim}}}
\end{align*}\] We calculated the MCSE of this correlation via bootstrapping.
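As a small illustration of this computation (not the simulation's actual code), the average rank correlation and its bootstrap MCSE can be obtained as follows, with `rho_hat` standing in for the per-repetition Spearman correlations \(\hat{\rho}_i\):

```r
set.seed(1)

# toy per-repetition Spearman correlations (placeholders for rho_hat_i)
rho_hat <- c(0.82, 0.79, 0.88, 0.75, 0.84)

# point estimate: mean across simulation repetitions
rho_bar <- mean(rho_hat)

# bootstrap MCSE: SD of the mean over resampled repetitions
boot_means <- replicate(2000, mean(sample(rho_hat, replace = TRUE)))
mcse <- sd(boot_means)
```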
```r
plot_rankcor <- sr_edit |>
  # if mcse is missing, set to 0
  mutate(mcse = ifelse(is.na(mcse), 0, mcse)) |>
  # uselessly used temp and cont instead of beta and pcor here
  mutate(outcome = case_when(
    outcome == "temp" ~ "Temporal",
    outcome == "cont" ~ "Contemporaneous"
  )) |>
  mutate(outcome = factor(outcome, levels = c("Temporal", "Contemporaneous"))) |>
  filter(pm == "rankcor") |>
  ggplot(aes(x = method, y = mean, colour = method)) +
  # add vertical line between methods
  geom_vline(colour = "#F3F4F5", xintercept = seq(1.5, 4, 1)) +
  geom_errorbar(aes(ymin = mean - 1*mcse, ymax = mean + 1*mcse),
                width = .8,
                position = position_dodge(0.7),
                show.legend = FALSE) +
  geom_point(position = position_dodge(0.7), size = 1.2) +
  ggh4x::facet_nested(n_id ~ outcome + n_tp,
                      axes = "all",
                      remove_labels = "y") +
  scale_x_discrete(guide = guide_axis(angle = 90)) +
  scale_y_continuous(expand = c(0, 0), limits = c(0, 1.1)) +
  scale_color_manual(values = meth_colors) +
  theme_centrality() +
  theme(legend.position = "none",
        text = element_text(size = 22)) +
  labs(title = "",
       x = "Method",
       colour = "Method",
       y = "Centrality Rank Correlation")

ggsave("plot_rankcor.pdf", plot_rankcor, height = 12, width = 16,
       path = here::here("figures/"), device = "pdf")

plot_rankcor
```
9 Regression
9.1 Power of Regression
9.1.1 In- and Outstrength combined
Prep data
```r
# Split data set into three based on true effect
power_df <- sr_edit |>
  filter(!str_detect(pm, "poweroneside")) |>
  mutate(pm = str_replace(pm, "powertwoside", "power")) |>
  filter(str_detect(pm, "power")) |>
  mutate(outcome = case_when(
    # outcome == "tempdens" ~ "Temporal\nDensity",
    # outcome == "contdens" ~ "Contemporaneous\nDensity",
    outcome == "outstrength" ~ "Temporal\nOutstrength",
    outcome == "strength" ~ "Contemporaneous\nStrength",
    outcome == "instrength" ~ "Temporal\nInstrength"
  )) |>
  # For now, remove contemporaneous strength
  filter(outcome != "Contemporaneous\nStrength") |>
  mutate(n_id = paste0("n = ", n_id)) |>
  mutate(n_id = factor(n_id, levels = c("n = 75", "n = 200"))) |>
  mutate(n_tp = paste0("t = ", n_tp)) |>
  mutate(n_tp = factor(n_tp, levels = c("t = 60", "t = 120"))) |>
  # split pm into two columns, take last number into new column
  separate_wider_delim(pm, delim = "reg", names = c("pm", "true_effect"))

power_list <- split(power_df, power_df$true_effect)
```
Function to create the plots:
```r
# Function to create the plot for a given dataframe
power_plot <- function(df) {
  ggplot(df, aes(x = n_tp, y = mean,
                 colour = method, fill = method, group = method)) +
    geom_errorbar(aes(ymin = mean - 1 * mcse, ymax = mean + 1 * mcse),
                  width = 0.5,
                  position = position_dodge(0.7),
                  show.legend = FALSE) +
    # add vertical line between methods
    geom_vline(colour = "#F3F4F5", xintercept = seq(1.5, 4, 1)) +
    geom_point(position = position_dodge(0.7), size = 1.4) +
    ggh4x::facet_nested(n_id ~ outcome,
                        axes = "all",
                        remove_labels = "y") +
    # scale_x_discrete(guide = guide_axis(angle = 90)) +
    scale_y_continuous(expand = c(0, 0), limits = c(-0.1, 1.1)) +
    scale_color_manual(values = meth_colors) +
    theme_centrality() +
    theme(legend.position = "none",
          strip.text.x.top = ggplot2::element_text(size = rel(0.85)),
          ggh4x.facet.nestline = element_line(colour = "#6d6d6e"),
          axis.text = element_text(size = 18)) +
    labs(title = "",
         x = "", colour = "", fill = "", group = "",
         y = "Detection Rate")
}
```
Here, we investigate the variability of point estimates in specific simulation conditions, for which we need the raw results; this part is therefore not reproducible.
```r
sim_results |>
  select(contains("divtrans")) |>
  rename_with(~ str_remove(., "bmlvar_diagnostics."), everything()) |>
  knitr::kable(col.names = c("Mean Number of DivTrans",
                             "Sum of DivTrans",
                             "Models with Divtrans"))
```
| Mean Number of DivTrans | Sum of DivTrans | Models with Divtrans |
|---|---|---|
| 4.390 | 878 | 100 |
| 0.005 | 1 | 1 |
| 1.310 | 262 | 26 |
| 0.000 | 0 | 0 |
11 Session Info
Please note that the visualizations were redone in an updated R version, whereas the computations of the simulation were run in a previous R version, as indicated in the Dockerfile.